Abstract

This research focused on developing theoretical and algorithmic
foundations  for  an  applicable  theory  of  cooperative  mission  control  for  groups  of 
heterogeneous,  distributed  UAVs.  The  research  is  motivated  by  the  problem  of 
coordination  of  activities  among  UAVs  in  adaptive  response  to  unknown  events.  The 
main  results  of  the  work  include:  (1)  new  techniques  for  the  solution  of  adaptive  search 
and  sensor  management,  leading  to  solution  of  large  scale  combinatorial  optimization 
problems  in  stochastic,  dynamic  environments,  based  on  integration  of  stochastic  control 
and  discrete  optimization  techniques;  (2)  distributed  control  techniques  for  trajectories  of 
agents  performing  search  and  tracking  while  having  to  maintain  communications 
connectivity;  (3)  cooperative  control  techniques  for  mission  management  involving 
rendezvous problems of multiple agents performing tasks; (4) distributed algorithms for
nonlinear  resource  allocation  problems  to  agents;  and  (5)  combinatorial  algorithms  for 
managing  connectivity  of  air-to-air  low  directional  communication  networks.  These 
results  provide  new  models  and  algorithms  for  cooperative  control  that  increase  the  level 
of autonomy that can be provided to UAVs, thereby enhancing the U.S. Air Force's
capability  to  use  unmanned  vehicles  without  requiring  large  numbers  of  human  operators. 

1.  Introduction 

In  Joint  Vision  2010,  the  Chairman  of  the  Joint  Chiefs  of  Staff  outlined  a  vision  of 
effective, efficient armed services for the next century. This document stressed the
importance of "information superiority," which it defined as "The capability to collect,
process, and disseminate an uninterrupted flow of information while exploiting or
denying an adversary's ability to do the same." The document identified the tactical use
of  autonomous  assets  as  a  key  technology  that  will  enable  the  commanders  to  obtain 
superior  information,  and  conduct  dominant  maneuvers  and  precision  engagements  in  an 
increasingly  hostile  environment.  Information  Superiority  remains  a  key  component  of 
Joint Vision 2020 in order “to achieve decision superiority, to support advanced
command and control capabilities, and to reach the full potential of dominant maneuver,
precision  engagement,  full  dimensional  protection,  and  focused  logistics.” 

In  recent  years,  the  United  States  has  been  involved  in  military  conflicts  in  Afghanistan 
and  Iraq  that  saw  increased  use  of  remotely-piloted  tactical  aircraft  such  as  Predators, 
Reapers  and  Global  Hawks  for  surveillance  and  target  prosecution.  These  vehicles 
enable  the  Air  Force  to  accomplish  missions  that  may  be  difficult  for  manned  aircraft  due 
to  survivability  or  other  reasons.  These  missions  include  Suppression  of  Enemy  Air 
Defenses (SEAD), moving target attack, fixed target attack, Intelligence, Surveillance and
Reconnaissance (ISR), jamming, theater missile defense, and counter weapons of mass
destruction. UAV prototypes have so far demonstrated the capability for unmanned
flight.

One  of  the  major  limitations  of  current  tactical  unmanned  air  vehicles  (UAVs)  is  the  high 
level  of  human  control  required  to  operate  a  single  vehicle.  This  places  limits  on  the  use 
of  large  numbers  of  autonomous  assets  acting  in  a  coordinated  manner.  To  achieve  the 
full potential of unmanned tactical vehicles, many of the decisions now made by human
operators must be generated by an intelligent cooperative control system that will be
guided  by  objectives  and  constraints  provided  by  human  operators.  This  cooperative 
control  will  have  to  solve  complex  dynamic  decision  problems  associated  with  mission 
planning  and  control,  in  an  unstructured  and  uncertain  environment,  in  near  real  time,  and 
without  human  intervention,  adapting  in  an  intelligent  fashion  to  events  which  arise  in  a 
hostile,  uncertain  and  rapidly  evolving  environment.  This  requires  new  technology  for 
cooperative, distributed mission control, capable of selecting and coordinating the
activities  of  multiple  heterogeneous  platforms  to  achieve  a  common  objective. 

This  report  summarizes  the  results  of  our  investigations  towards  the  development  of 
theoretical  foundations  for  an  applicable  theory  of  cooperative,  distributed  mission 
control  for  teams  of  heterogeneous  UAVs,  based  on  the  application  of  distributed 
optimization  models  and  techniques.  Our  results  focus  on  mathematical  models  that 
abstract  classes  of  problems  associated  with  missions  conducted  by  unmanned  air 
vehicles,  and  develop  algorithms  to  compute  optimal  or  near-optimal  decisions  for  these 
models,  guided  by  inputs  from  human  operators  concerning  desirability  of  outcomes. 
These algorithms provide the foundations for increased automation in the operation of
unmanned air vehicles, and will allow the U.S. Air Force to enhance significantly its
capability  to  employ  unmanned  vehicles  without  relying  on  large  numbers  of  human 
operators  to  control  and  coordinate  the  individual  vehicles. 

The  rest  of  this  report  is  organized  as  follows:  The  next  section  provides  an  overview  of 
the  principal  results  of  this  research.  Section  3  summarizes  the  personnel  supported 
under  this  grant.  Section  4  contains  the  list  of  publications  that  document  the  results. 


2.  Principal  Results 


In  this  Section,  we  present  the  main  results  of  our  work.  The  work  consisted  of  detailed 
exploration  of  different  decision  problems  associated  with  cooperative  control  of  teams 
of  unmanned  vehicles  conducting  missions  in  uncertain  environments. 

A.  Distributed  Control  and  Optimization  in  Energy  Limited  Cooperative 
Systems 

Many modern optimization and control tasks can only be accomplished by deploying a
distributed  cooperative  system,  which  consists  of  geographically  distributed  agents 
working  on  missions  that  require  their  combined  efforts,  with  little  or  no  central 
coordination.  Our  research  focused  on  the  problem  of  deploying  and  moving  sensors  to 
achieve  coverage  control  in  order  to  maximize  the  detection  probability  of  random 
events,  taking  into  account  the  discontinuities  introduced  by  obstacles,  the  limited 
sensing  field  of  view  of  the  sensors,  and  limits  imposed  in  maintaining  communications 
connectivity  to  other  sensors  [13],[20],[22].  Our  results  developed  a  distributed  gradient- 
based  scheme  that  uses  only  local  information  available  to  each  distributed  agent  about 
its  location  and  its  neighbors’  locations.  We  also  developed  a  modified  problem 
formulation  with  a  different  objective  function  which  provides  a  more  balanced  coverage 
of  the  mission  space  when  necessary. 
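
The gradient-based coverage idea can be illustrated with a minimal sketch. It assumes an exponentially decaying detection model, a uniform event density on a small grid, and no obstacles, field-of-view limits, or connectivity constraints, so it is a simplified stand-in rather than the scheme of [13],[20],[22]. The constants `P0` and `LAM` and all function names are hypothetical. Each agent moves along the gradient of the joint detection probability with respect to its own position; in this unbounded-range model every agent sees every other agent, whereas with limited sensing range only neighbors would enter the gradient.

```python
import math

P0, LAM = 0.9, 1.0  # assumed peak detection probability and decay rate

def detect_prob(sensor, point):
    """Assumed sensing model: detection probability decays with distance."""
    return P0 * math.exp(-LAM * math.dist(sensor, point))

def coverage(sensors, grid):
    """Average probability that at least one sensor detects an event."""
    total = 0.0
    for x in grid:
        miss = 1.0
        for s in sensors:
            miss *= 1.0 - detect_prob(s, x)
        total += 1.0 - miss
    return total / len(grid)

def gradient(i, sensors, grid):
    """Gradient of the coverage objective with respect to sensor i's position."""
    gx = gy = 0.0
    si = sensors[i]
    for x in grid:
        d = math.dist(si, x)
        if d < 1e-9:
            continue  # the model is not differentiable exactly at a grid point
        others_miss = 1.0
        for k, s in enumerate(sensors):
            if k != i:
                others_miss *= 1.0 - detect_prob(s, x)
        # d/ds_i of P0*exp(-LAM*d) is LAM * p_i * (x - s_i) / d
        w = others_miss * LAM * detect_prob(si, x) / d
        gx += w * (x[0] - si[0])
        gy += w * (x[1] - si[1])
    return gx / len(grid), gy / len(grid)

def step(sensors, grid, rate=2.0):
    """All agents take one simultaneous gradient-ascent step."""
    grads = [gradient(i, sensors, grid) for i in range(len(sensors))]
    return [(s[0] + rate * g[0], s[1] + rate * g[1])
            for s, g in zip(sensors, grads)]
```

Starting from three sensors clustered in one corner of a 5 x 5 grid, repeated calls to `step` spread the sensors out and raise the value returned by `coverage`.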

We  extended  the  results  above  to  missions  that  search  for  new  events,  while  maintaining 
continuous  collection  of  previously  observed  events,  motivated  by  the  application  of 
detection  and  tracking  vehicles  [39], [42].  We  process  the  information  generated  by  event 
detections  using  Bayesian  techniques,  to  estimate  recursively  the  locations  of  potential 
events. Once a set of high occupancy probability locations is identified, we develop a
joint  optimization  problem  incorporating  both  coverage  and  data  collection  requirements 
as  objectives,  and  solve  these  in  a  distributed  manner.  The  results  were  demonstrated  in  a 
robotic  test  bed  as  well  as  in  simulation  [39]. 
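
As an illustration of the recursive Bayesian step, the sketch below updates the occupancy probability of each cell of a grid from binary detections. The detection and false-alarm probabilities and the function names are assumptions made for the example; the sensor models used in [39], [42] may differ.

```python
def update_cell(prior, detected, p_detect=0.8, p_false=0.1):
    """Posterior probability that a cell holds an event, by Bayes' rule."""
    like_occupied = p_detect if detected else 1.0 - p_detect
    like_empty = p_false if detected else 1.0 - p_false
    numerator = like_occupied * prior
    return numerator / (numerator + like_empty * (1.0 - prior))

def update_grid(grid, observations):
    """observations maps cell index -> True/False; unobserved cells keep their prior."""
    return [update_cell(p, observations[i]) if i in observations else p
            for i, p in enumerate(grid)]

beliefs = [0.5, 0.5, 0.5]
beliefs = update_grid(beliefs, {0: True, 2: False})
beliefs = update_grid(beliefs, {0: True})
# beliefs[0] is now 64/65: two detections, each with likelihood ratio 8
```

Cells whose posterior exceeds a chosen threshold would then be treated as the high occupancy probability locations that enter the joint coverage and data collection problem.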

As  a  related  problem,  we  developed  an  event-driven  communication  scheme  to  solve  the 
problem  of  how  and  when  agents  should  communicate  in  order  to  make  their  information 
exchange  more  efficient  and  thus  save  energy.  Specifically,  we  formulate  the  problem  as 
an  optimization  problem  where  multiple  agents  must  choose  their  individual  decisions  in 
order  to  optimize  a  common  objective  while  communicating  with  each  other  to  exchange 
updated  information.  We  obtain  conditions  under  which  the  optimization  process 
converges  with  asynchronous  communication  of  state  information  among  agents.  We 
apply  this  asynchronous  (event-driven)  approach  to  the  coverage  control  problem  and 
numerically  show  that  it  substantially  reduces  energy  consumption  while  preserving  the 
same  performance  as  a  synchronous  algorithm  [19], [31]. 
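
The event-driven idea can be sketched on a toy problem. Two agents minimize a shared quadratic objective, each using only its own state and the last value its neighbor broadcast, and an agent broadcasts only when its state has drifted from its last broadcast by more than a threshold. The objective, the fixed threshold, and all names are assumptions for illustration. With a fixed threshold the iterates settle in a neighborhood of the optimum rather than exactly at it, and the convergence conditions of [19], [31] are not reproduced here.

```python
def run(threshold, iters=200, step=0.1):
    """Minimize (x0-1)^2 + (x1-2)^2 + 0.5*(x0-x1)^2 with event-driven messaging."""
    set_points = [1.0, 2.0]   # hypothetical private targets of the two agents
    x = [0.0, 0.0]            # true local states
    last_sent = [0.0, 0.0]    # value each agent most recently broadcast
    messages = 0
    for _ in range(iters):
        new_x = []
        for i in (0, 1):
            other = last_sent[1 - i]  # possibly stale neighbor state
            grad = 2.0 * (x[i] - set_points[i]) + (x[i] - other)
            new_x.append(x[i] - step * grad)
        x = new_x
        for i in (0, 1):
            # Event: local state has drifted too far from what neighbors believe
            if abs(x[i] - last_sent[i]) > threshold:
                last_sent[i] = x[i]
                messages += 1
    return x, messages

x_event, n_event = run(threshold=0.05)
x_sync, n_sync = run(threshold=0.0)   # broadcast after every change
```

The exact minimizer of this toy objective is (1.25, 1.75). Comparing `n_event` with `n_sync` shows the communication saved, and comparing `x_event` with the minimizer shows the accuracy given up by using a fixed threshold.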


B. Cooperative Mission Control for Multi-UAV Rendezvous Problems

We  studied  missions  where  UAVs  must  visit  multiple  targets  and  obtain  rewards 
associated  with  each  target  with  the  added  requirement  that  two  or  more  UAVs  must  be 
present in the vicinity of each target in order to collect the associated reward. The mission
setting is stochastic with targets possibly appearing or disappearing in real time. We
developed  a  Cooperative  Receding  Horizon  (CRH)  controller  to  maximize  the  total 
reward  obtained  in  a  finite  mission  time  horizon.  In  this  approach  (based  on  some  of  our 
earlier  work  under  this  grant)  we  control  the  motion  so  as  to  maximize  the  expected 
rewards  over  a  planning  window  without  explicit  target  assignment  [41].  In  particular,  the 
receding  horizon  controller  computes  the  optimal  headings  of  the  UAVs  at  the  end  of 
each  planning  horizon,  such  that  the  total  expected  reward  obtained  by  the  team  is 
maximized (assuming no target emerges on the way during the planning horizon). These
headings are then executed for a shorter action horizon, unless a new target is detected, in
which  case  the  optimization  is  performed  again.  If  there  is  no  new  target,  then  the  optimal 
headings  will  be  recomputed  at  the  end  of  the  action  horizon.  Even  though  no  explicit 
task  assignment  is  performed,  we  were  able  to  show  that  the  proposed  controller  drives 
UAVs  to  targets.  Hence,  the  trajectory  generated  by  the  controller  is  “stationary”  in  the 
sense  that  the  vehicles  converge  to  some  targets  even  though  the  control  decision  is  made 
in  real  time  with  no  explicit  assignment  involved.  This  convergence  result  was  formally 
established  for  2  UAVs  and  a  single  target  [41].  Furthermore,  this  controller  integrates 
the  problems  of  task  assignment  and  trajectory  generation,  both  in  an  uncertain 
environment. To evaluate how well our CRH algorithm maximizes the collected reward,
we compared it to an approximate upper bound on the
maximal reward collected. Extensive simulations show that the reward collected based on
the  CRH  algorithm  is  close  to  this  approximate  upper  bound  in  all  cases  considered, 
while  the  mission  completion  time  is  longer.  This  is  expected,  since  the  CRH  controller  is 
designed  to  hedge  against  uncertainty  by  not  following  straight  line  paths  to  targets. 
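
The plan-then-execute loop of a receding horizon controller can be sketched as follows. This is a simplified illustration and not the CRH controller of [41]. Headings are discretized, the reward surrogate (each target's reward discounted by the estimated arrival times of all UAVs) is an assumption, and the constants and names are hypothetical. No target is ever assigned explicitly: the UAVs jointly pick the headings that maximize the surrogate at the end of the planning horizon, move for a shorter action horizon, and re-plan.

```python
import itertools
import math

HEADINGS = [2 * math.pi * k / 16 for k in range(16)]  # assumed discretization
SPEED, H_MAX, ACTION, BETA = 1.0, 1.0, 0.25, 0.5      # illustrative constants

def project(pos, heading, dist):
    """Position reached after flying `dist` along `heading`."""
    return (pos[0] + dist * math.cos(heading), pos[1] + dist * math.sin(heading))

def expected_reward(projected, lookaheads, targets):
    """Assumed surrogate: reward discounted by every UAV's estimated arrival time."""
    total = 0.0
    for location, reward in targets:
        discount = 1.0
        for p, look in zip(projected, lookaheads):
            arrival = (look + math.dist(p, location)) / SPEED
            discount *= math.exp(-BETA * arrival)
        total += reward * discount
    return total

def crh_step(uavs, targets):
    """Plan over the lookahead distance, then move for the shorter action distance."""
    nearest = [min(math.dist(u, loc) for loc, _ in targets) for u in uavs]
    # Shrink each lookahead so that no UAV plans past its nearest target
    looks = [min(H_MAX, d) for d in nearest]

    def score(headings):
        projected = [project(u, h, l) for u, h, l in zip(uavs, headings, looks)]
        return expected_reward(projected, looks, targets)

    best = max(itertools.product(HEADINGS, repeat=len(uavs)), key=score)
    return [project(u, h, min(ACTION, d)) for u, h, d in zip(uavs, best, nearest)]

uavs = [(0.0, 0.0), (4.0, 0.0)]
targets = [((2.0, 3.0), 10.0)]   # (location, reward)
for _ in range(60):
    uavs = crh_step(uavs, targets)
```

In this two-UAV, single-target run both vehicles end up at the target, mirroring the setting in which the convergence result was established. A newly detected target would be handled by adding it to `targets` before the next call to `crh_step`.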


C.  Cooperative  Mission  Control  for  Adaptive  Sensor  Management 

In  this  work,  we  developed  cooperative  control  algorithms  for  surveillance  missions 
involving  teams  of  heterogeneous  UAVs  with  combinations  of  active  (tracking  and 
imaging  radar,  LADAR)  and  passive  (electro-optical,  infrared)  sensors  that  can  focus  on 
individual objects with different modes. These problems were motivated by platforms such
as Global Hawks, Predators and other UAVs working at different ranges as part of a
layered  surveillance  network.  The  goal  is  to  coordinate  the  allocation  of  the  multiple 
sensor  resources,  ranging  from  trajectories  to  sensing  activities,  in  order  to  achieve  an 
accurate  representation  of  the  location  and  identity  of  objects  in  the  scenario,  while  doing 
this  in  an  adaptive  manner  that  exploits  previously  collected  information. 

In  one  of  our  approaches,  we  formulated  the  problem  for  controlling  a  set  of  sensors  with 
a  finite  number  of  sensing  options  and  finite  valued  measurements  that  were  tasked  to 
locate  and  classify  all  objects  in  an  area  of  interest,  as  accurately  as  possible  with  limited 
sensing  resources.  We  formulated  this  problem  as  a  Partially  Observed  Markov  Decision 
Problem (POMDP) with combinatorially large state and action spaces [1-3], so its exact
solution requires excessive computation. We exploited statistical
conditional  independence  assumptions  of  measurements  to  approximate  the  original 
optimization  problem  by  a  convex  relaxation  of  this  problem  that  provided  a  lower  bound 
on the achievable performance [3,29]. We developed a class of algorithms
that obtained optimal adaptive solutions to this lower bound, combining techniques from
integer  programming  with  stochastic  dynamic  programming  algorithms  for  solution  of 
POMDPs.  Furthermore,  we  developed  control  algorithms  from  the  solutions  of  this 
approximate  problem  using  receding  horizon  controllers  in  a  model-predictive  approach 
[30].  The  resulting  controllers  provide  superior  performance  to  alternative  algorithms 
proposed  in  the  literature  and  obtain  solutions  to  large-scale  POMDP  problems  several 
orders  of  magnitude  faster  than  optimal  approaches.  Surprisingly,  the  performance  of  the 
receding  horizon  controllers  is  close  to  the  predicted  lower  bound  performance. 
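
To make the setting concrete, the sketch below shows the belief update for one object with finite-valued measurements, together with a one-step lookahead that picks the sensing mode with the smallest expected classification error. The myopic rule is a deliberately simple stand-in and not the relaxation-based algorithm of [3,29,30]. The mode names and likelihood tables are hypothetical.

```python
def posterior(belief, likelihood, z):
    """belief[k] = P(type k); likelihood[k][z] = P(measurement z | type k)."""
    joint = [b * likelihood[k][z] for k, b in enumerate(belief)]
    norm = sum(joint)          # probability of observing z under this belief
    return [j / norm for j in joint]

def expected_error(belief, likelihood):
    """Expected error of a maximum a posteriori decision after one measurement."""
    error = 0.0
    for z in range(len(likelihood[0])):
        joint = [b * likelihood[k][z] for k, b in enumerate(belief)]
        error += sum(joint) - max(joint)
    return error

def best_mode(belief, modes):
    """Myopic choice: the mode whose single measurement leaves the least error."""
    return min(modes, key=lambda name: expected_error(belief, modes[name]))

modes = {
    "coarse": [[0.6, 0.4], [0.4, 0.6]],   # hypothetical low-resolution mode
    "fine":   [[0.9, 0.1], [0.1, 0.9]],   # hypothetical high-resolution mode
}
belief = [0.5, 0.5]
choice = best_mode(belief, modes)              # "fine"
belief = posterior(belief, modes[choice], 0)   # approximately [0.9, 0.1]
```

A myopic rule like this ignores the sensing budget shared across many objects, which is the coupling that makes the full problem combinatorially large and that the relaxation described above is designed to handle.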

In  our  work,  we  also  extended  our  initial  formulation,  which  focused  on  stationary 
objects,  to  scenarios  with  moving  objects.  We  used  Hidden  Markov  Models  (HMMs)  for 
the  evolution  of  objects,  according  to  the  dynamics  of  a  birth-death  process.  Along  the 
lines of our previous results, we developed a new lower bound on the performance of
adaptive controllers in these scenarios, associated with a solution of a large POMDP
problem, and developed algorithms for computing solutions to this lower bound POMDP that
combine integer programming with stochastic dynamic programming. These algorithms
can  be  used  as  before  in  a  receding  horizon  adaptive  control  for  sensor  allocation  in  the 
presence  of  moving  objects. 

We also considered an adaptive search problem using energy allocation where sensing
actions  are  continuous-valued  and  the  underlying  measurement  space  is  also  continuous. 
We  extended  our  previous  hierarchical  approach  based  on  performance  bounds  to  this 
problem  and  developed  novel  implementations  of  stochastic  dynamic  programming 
techniques  to  solve  this  problem.  Our  algorithms  are  nearly  two  orders  of  magnitude 
faster  than  previously  proposed  approaches  and  yield  solutions  of  comparable  quality 
[47]. 

Although  the  above  algorithms  are  significantly  faster  than  alternative  optimization 
approaches  using  dynamic  programming,  they  still  involve  on-line  solution  of  POMDPs 
in  a  hierarchical  manner,  which  can  be  a  major  limiting  factor  in  terms  of  computation 
requirements.  Another  limitation  was  that  the  observations  provided  by  sensors  had  to 
satisfy  a  conditional  independence  assumption  as  well  as  being  finite-valued,  which 
limited  the  applicability  of  the  results.  To  address  this  shortcoming,  we  developed  a  new 
mathematical  theory  for  adaptive  sensor  resource  allocation  that  models  sensors  as 
providing  observations  of  primitive  features  as  opposed  to  object  types.  In  this  approach, 
objects  are  modeled  as  spatially  related  collections  of  features,  characterized 
by  object  type  and  pose;  sensors  measure  noisy  projections  of  these  features  subject  to 
degradation  by  noise,  obscuration,  missed  detections  and  added  background  clutter. 

Using  techniques  based  on  random  sets,  we  developed  several  approaches  to  predict  the 
value  of  information  that  would  be  provided  by  potential  measurements.  A  promising 
approach was based on a new information-theoretic performance bound, which bounds the
probability of confusing one object type with another and for which much of the
computation can be performed off-line [46]. These bounds can be used as surrogate
performance  measures  for  adaptive  resource  allocation  algorithms  that  can  scale  to  large 
numbers  of  objects  while  maintaining  real-time  performance.  Furthermore,  these 
algorithms can be implemented in distributed fashion across multiple platforms, using the
theory of distributed assignment algorithms. In simulations, the resulting algorithms
achieved  performance  comparable  to  on-line,  adaptive  sensor  management  algorithms 
with  much  reduced  computation  requirements. 


D.  Dynamic  algorithms  for  topology  selection  and  maintenance  in  airborne 
wireless  communication  networks 

An  important  problem  faced  by  air  vehicles  is  that  maintaining  an  air-to-air  connected 
communication  network  can  be  challenging  in  hostile  environments.  This  is  due  in  part 
to  the  use  of  directional  antennas  (to  increase  anti-jam  properties,  and  avoid  hostile 
exploitation  of  communications)  and  to  the  rapid  movement  of  air  vehicles.  The  problem 
of  choosing  a  network  topology  to  connect  communication  nodes  subject  to  specific 
objectives,  constraints,  and  properties  is  a  broadly  studied  problem,  with  approaches 
varying  widely  depending  on  the  hardware  capabilities  and  limitations,  the  required 
performance  criteria,  and  the  available  budget.  Our  work  was  motivated  by  a  topology 
problem  in  which  naval,  ground,  air,  and  space  vehicles  require  secure,  high  bandwidth 
data  communications  across  long  distances  in  an  unstable  environment.  Each  wireless 
connection  in  the  network  requires  a  pair  of  directional  antennae;  the  connection  is  point- 
to-point  line-of-sight,  not  broadcast  over  a  wider  region.  While  some  nodes  may  be 
stationary,  stable,  and  secure,  the  majority  of  the  hundreds  of  nodes  in  the  network  are 
moving.  As  a  result,  a  line-of-sight  connection  between  a  pair  of  nodes  is  subject  to 
predictable  and  unpredictable  interruption.  Because  the  network  is  constantly  evolving, 
new  topologies  must  be  generated  very  quickly  so  that  network  connectivity  is 
maintained. 

In  this  work,  we  developed  combinatorial  algorithms  to  solve  the  dynamic  connectivity 
problem.  We  define  this  problem  as  finding  a  minimum  degree-constrained 
spanning  tree.  We  developed  optimal  integer  programming  algorithms  that  can  generate 
the  topology  backbone  for  large  point-to-point  wireless  networks  quickly,  even  in 
networks  with  hundreds  of  nodes,  by  exploiting  quick  bounding  strategies,  and  using  a 
delayed  row  generation  algorithm.  We  also  extended  this  algorithm  to  variations  of  the 
topology  problem,  including  scenarios  where  the  antennae  in  the  network  consist  of 
multiple  incompatible  technologies,  and  where  the  connectivity  must  be  maintained  over 
time  with  costs  of  switching  links,  in  order  to  maximize  connectivity  over  a  time  horizon. 
The  resulting  algorithms  were  compared  with  efficient  alternatives  discussed  in  the 
literature,  and  were  shown  to  be  significantly  faster  and  more  robust  in  finding  optimal 
solutions. Our results provide a basis for the design of real-time topology
maintenance algorithms, as well as performance bounds against which faster heuristic
approaches for topology maintenance can be compared.
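
For intuition about the degree constraint, the sketch below builds a spanning tree greedily while capping the number of links per node, as would be the case when each node carries a limited number of directional antennas. It is a Kruskal-style heuristic of the kind the optimal algorithms could be benchmarked against, not the integer programming method described above. It assumes every pair of nodes has line of sight and uses Euclidean distance as link cost; the function name is hypothetical.

```python
import itertools
import math

def degree_capped_tree(points, max_degree):
    """Greedy heuristic: shortest links first, skipping nodes with no free antenna."""
    n = len(points)
    parent = list(range(n))

    def find(a):
        # Union-find root lookup with path halving
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    degree = [0] * n
    links = sorted(itertools.combinations(range(n), 2),
                   key=lambda e: math.dist(points[e[0]], points[e[1]]))
    tree = []
    for u, v in links:
        ru, rv = find(u), find(v)
        if ru != rv and degree[u] < max_degree and degree[v] < max_degree:
            parent[ru] = rv
            degree[u] += 1
            degree[v] += 1
            tree.append((u, v))
    return tree

nodes = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1)]
backbone = degree_capped_tree(nodes, max_degree=2)
# backbone == [(0, 1), (0, 2), (1, 4), (2, 3)]
```

Without the cap, the cheapest tree for these five nodes is a star centered on node 0 with four links. With two antennas per node the heuristic returns a path instead. On sparse link graphs a greedy pass can fail to connect every node or return a costlier tree than the optimum, which is where exact methods and their bounds are useful.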

E.  Distributed  algorithms  for  nonlinear  resource  allocation 

Nonlinear  resource  allocation  problems  are  a  class  of  optimization  problems  where 
heterogeneous  resources  that  are  geographically  distributed  have  to  be  allocated  to  a 
diverse set of tasks, also distributed over a region of interest. The underlying
performance of executing a task is a nonlinear function of the bundle of resources
assigned  to  it.  These  problems  are  motivated  by  diverse  applications  such  as  in  search 
theory,  weapon  target  assignment,  sensor  management,  market  equilibria,  production 
planning,  scheduling  of  mass  screening  tests  and  allocation  of  software-testing  resources. 
The  linear  cost  generalized  assignment  and  transportation  problems  can  be  seen  as  special 
cases  of  these  problems.  These  types  of  allocation  problems  arise  in  missions  associated 
with unmanned vehicles. Our focus was on developing solution approaches for this
problem  that  would  be  executed  in  a  distributed  manner  by  teams  of  unmanned  air 
vehicles. 

In  this  work,  we  developed  a  new  class  of  algorithms  for  distributed  nonlinear  resource 
allocation  based  on  nonlinear  extensions  of  the  auction  algorithm  for  linear  assignment 
problems, called RAP Auction [45]. This algorithm exploits the graph structure present
in resource allocation problems (RAPs) and integrates ideas from convex and combinatorial optimization. Unlike
most  previous  techniques  for  this  class  of  problems,  it  is  applicable  to  arbitrary  convex 
monotonic  utilities,  thus  relaxing  assumptions  of  differentiability  and  strict  convexity. 
Furthermore,  the  algorithm  has  a  simple  computation  structure  that  is  amenable  to 
parallelization. 
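
For reference, the classical auction algorithm for the linear assignment problem, which RAP Auction extends, can be sketched as below. This is the standard linear version only; the nonlinear extension of [45] is not reproduced. Unassigned agents bid for their most valuable task, raising its price by the gap to their second-best option plus a small increment `eps`. For integer benefits, choosing `eps` below 1/n yields an optimal assignment. The function and variable names are illustrative.

```python
def auction(benefit, eps=None):
    """Assign n agents to n tasks, maximizing total benefit[i][j]."""
    n = len(benefit)
    if eps is None:
        eps = 1.0 / (n + 1)      # below 1/n: optimal for integer benefits
    prices = [0.0] * n
    owner = [None] * n           # owner[j]: agent currently holding task j
    assigned = [None] * n        # assigned[i]: task currently held by agent i
    unassigned = list(range(n))
    while unassigned:
        i = unassigned.pop()
        values = [benefit[i][j] - prices[j] for j in range(n)]
        j_best = max(range(n), key=values.__getitem__)
        best = values[j_best]
        second = max((v for j, v in enumerate(values) if j != j_best),
                     default=best)
        # Bid: raise the price until the agent is nearly indifferent
        prices[j_best] += best - second + eps
        if owner[j_best] is not None:
            # The previous holder is outbid and re-enters the auction
            assigned[owner[j_best]] = None
            unassigned.append(owner[j_best])
        owner[j_best] = i
        assigned[i] = j_best
    return assigned, prices

benefit = [[10, 5, 1],
           [8, 7, 2],
           [6, 4, 9]]
assignment, prices = auction(benefit)
# assignment == [0, 1, 2], total benefit 26
```

Each bid uses only the bidder's own benefits and the current prices, which is the simple, local computation structure that makes auction methods attractive for parallel and distributed implementation.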

We developed extensions of RAP Auction and resource-wise optimization algorithms that
are suitable for distributed computation, and established convergence of the algorithms to
correct solutions under asynchronous, unreliable and delayed communications with
minimal coordination capabilities. The RAP Auction is a primal-dual approximate
technique with finite convergence, while the resource-wise optimization algorithm is an
exact primal algorithm with asymptotic but geometric convergence. Using concepts from
asynchronous optimization, we proved that the algorithms satisfy critical monotonicity
properties  that  guarantee  convergence  to  optimal  solutions  under  totally  asynchronous 
implementations. 